Abstract

In this paper, we present a multidimensional trellis coded modulation scheme for
a high rate 2x2 multiple-input multiple-output system over slow fading channels.
Set partitioning of the Golden code [9] is designed specifically to increase the minimum
determinant. The branches of the outer trellis code are labeled with these partitions and
the Viterbi algorithm is applied for trellis decoding. In order to compute the branch metrics
a sphere decoder is used. The general framework for code design and optimization is
given. Performance of the proposed scheme is evaluated by simulation and it is shown
that it achieves significant performance gains over the uncoded Golden code.

Index terms: Lattice, set partitioning, trellis coded modulation, Golden code, diversity,
coding gain, minimum determinant.

1 Introduction

Space-time codes were proposed in [1] as a combination of channel coding with transmit 
diversity techniques in order to enhance data rates and reliability in multi-antenna wireless 
communications systems. In the coherent scenario, where the channel state information (CSI) 
is available at the receiver, the design criteria for space-time codes in slow fading channels
were developed: rank and determinant criteria [1]. The design criteria aim at maximizing
the minimum rank and determinant of the codeword distance matrix in order to maximize 
the diversity and coding gains. This in turn guarantees the best possible asymptotic slope of 
the error performance curve on a log-log scale, as well as a shift to the left of the curve. 

Subsequent works resulted in new space-time trellis codes, orthogonal space-time block codes
[2,4], etc. In particular, orthogonal space-time block codes attracted a lot of interest due to 
their low decoding complexity and high diversity gain. Further work produced full diversity, 
full rate algebraic space-time block codes for any number of transmit antennas, using number 
theoretical methods [5-7]. A general family of full rank and full rate linear dispersion
space-time block codes based on cyclic division algebras was proposed in [8]. However, the above
coding schemes do not always exploit the full potential of the multiple-input multiple-output
(MIMO) system in terms of the diversity-multiplexing gain trade-off [3]. In [9], the Golden code
was proposed as a full rate and full diversity code for 2x2 MIMO systems with non-vanishing
minimum determinant (NVD). It was shown in [10] how this property guarantees that the code
achieves the diversity-multiplexing gain trade-off.

In this work we focus on the slow fading model, where it is assumed that the channel 
coefficients are fixed over the duration of a fairly long frame. In such a case, in order to 
reduce the decoding complexity, concatenated coding schemes are appropriate. Space-time 
trellis codes (STTCs) transmitting PSK or QAM symbols from each antenna were designed 
according to both rank and determinant criteria [1]. A more flexible design, using a
concatenated scheme, makes it possible to separate the optimization of the two design criteria. As an inner
code, we can use a simple space-time block code, which can guarantee full diversity for any 
spectral efficiency (e.g. Alamouti code [2]). An outer code is then used to improve the coding 
gain. Essentially two approaches are available: 

1. bit-interleaved coded modulation (BICM) using a powerful binary code and computing 
bit reliability (soft outputs) for the inner code; 

2. trellis coded modulation (TCM) using set partitioning of the inner code. 

The first approach requires a soft output decoder of the inner code, which can have high 
complexity as the spectral efficiency increases. The second approach, considered in this
paper, overcomes the above limitation and is appropriate for high data rate systems. We note
how the NVD property for the inner code is essential when using a TCM scheme: such 
schemes usually require a constellation expansion, which will not suffer from a reduction of 
the minimum determinant. This advantage is not available with Super-orthogonal space-time 
trellis codes proposed in [12]. 

A first attempt to concatenate the Golden code with an outer trellis code was made in 
[18]. Set partitioning of the inner code was used to increase the minimum determinant of 
the inner codewords, which label the branches of the outer trellis code. The resulting ad hoc 
scheme suffered from a high trellis complexity. 

In this paper, we develop a general framework for code design and optimization for Golden
Space-Time Trellis Coded Modulation (GST-TCM) schemes. In [13-16], lattice set partitioning,
combined with a trellis code, is used to increase the minimum squared Euclidean distance
between codewords. Here, it is used to increase the minimum determinant. The Viterbi
algorithm is used for trellis decoding, where the branch metrics are computed by using a lattice
sphere decoder [11] for the inner code.



We consider partitions of the Golden code with increasing minimum determinant. In
turn, this corresponds to a $\mathbb{Z}^8$ lattice partition, which is labeled by using a sequence of nested
binary codes. The resulting partitions are selected according to a design criterion that is
similar to Ungerboeck design rules [14,19]. We design different GST-TCMs and optimize
their performance according to the design criterion.

For example, we will show that 4 and 16 state TCMs achieve significant performance
gains of 3 dB and 4.2 dB, at a frame error rate (FER) of $10^{-3}$, over the uncoded Golden code at
spectral efficiencies of 7 and 6 bits per channel use (bpcu), respectively.

The rest of the paper is organized as follows. Section 2 introduces the system model. 
Section 3 presents a set partitioning of the Golden code which increases the minimum
determinant. Section 4 presents the GST-TCM design criteria and various examples of our
scheme. Conclusions are drawn in Section 5.

The following notations are used in the paper. Let $^T$ denote transpose and $^\dagger$ denote
Hermitian transpose. Let $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{C}$ and $\mathbb{Z}[i]$ denote the ring of rational integers, the field of
rational numbers, the field of complex numbers, and the ring of Gaussian integers, where
$i^2 = -1$. Let $GF(2) = \{0, 1\}$ denote the binary Galois field. Let $\mathbb{Q}(\theta)$ denote an algebraic
number field generated by the primitive element $\theta$. The real and imaginary parts of a complex
number are denoted by $\Re(\cdot)$ and $\Im(\cdot)$. The $m \times m$ dimensional identity matrix is denoted
by $I_m$. The $m \times n$ dimensional zero matrix is denoted by $0_{m \times n}$. The Frobenius norm of a
matrix is denoted by $\|\cdot\|_F$. Let $\mathbb{Z}^8$ be the 8-dimensional integer lattice and let $D_4$ and $E_8$
(Gosset lattice) denote the densest sphere packings in 4 and 8 dimensions [21].

2 System Model 

We consider a 2 x 2 ($n_T = 2$, $n_R = 2$) MIMO system over slow fading channels. The received
signal matrix $Y \in \mathbb{C}^{2 \times 2L}$, where 2L is the frame length, is given by

$$Y = HX + Z, \qquad (1)$$

where $Z \in \mathbb{C}^{2 \times 2L}$ is the complex white Gaussian noise with i.i.d. samples $\sim \mathcal{N}_{\mathbb{C}}(0, N_0)$,
$H \in \mathbb{C}^{2 \times 2}$ is the channel matrix, which is constant during a frame and varies independently
from one frame to another. The elements of H are assumed to be i.i.d. circularly symmetric
Gaussian random variables $\sim \mathcal{N}_{\mathbb{C}}(0, 1)$. The channel is assumed to be known at the receiver.
In (1), $X = [X_1, \ldots, X_t, \ldots, X_L] \in \mathbb{C}^{2 \times 2L}$ is the transmitted signal matrix, where $X_t \in \mathbb{C}^{2 \times 2}$.
There are three different options for selecting inner codewords $X_t$, $t = 1, \ldots, L$:



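1. $X_t$ is a codeword of the Golden code $\mathcal{G}$, i.e.,

$$X_t = \frac{1}{\sqrt{5}} \begin{bmatrix} \alpha(a + b\theta) & \alpha(c + d\theta) \\ i\bar{\alpha}(c + d\bar{\theta}) & \bar{\alpha}(a + b\bar{\theta}) \end{bmatrix} \qquad (2)$$

where $a, b, c, d \in \mathbb{Z}[i]$ are the information symbols, $\theta = 1 - \bar{\theta} = \frac{1+\sqrt{5}}{2}$, $\alpha = 1 + i - i\theta$,
$\bar{\alpha} = 1 + i(1 - \bar{\theta})$, and the factor $\frac{1}{\sqrt{5}}$ is necessary for energy normalizing purposes [9].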

2. $X_t$ are independently selected from a linear subcode of the Golden code;

3. A trellis code is used as the outer code encoding across the symbols $X_t$, selected from
partitions of $\mathcal{G}$.

We denote Case 1 as the uncoded Golden code, Case 2 as the Golden subcode, and Case 3 as 
the Golden space-time trellis coded modulation. 
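
As a quick numerical illustration of the codeword structure in (2), the following sketch builds Golden codewords in plain Python and searches a small set of Gaussian integers for the smallest value of |det X|^2. The function names, the nested-tuple matrix representation, and the symbol set with real and imaginary parts in {-1, 0, 1} are assumptions made for this example only.

```python
import itertools
import math

THETA = (1 + math.sqrt(5)) / 2          # theta
THETA_BAR = 1 - THETA                   # theta_bar = 1 - theta
ALPHA = 1 + 1j - 1j * THETA             # alpha = 1 + i - i*theta
ALPHA_BAR = 1 + 1j * (1 - THETA_BAR)    # alpha_bar = 1 + i(1 - theta_bar)


def golden_codeword(a, b, c, d):
    """Return the 2x2 Golden codeword of (2) as nested tuples."""
    s = 1 / math.sqrt(5)
    return (
        (s * ALPHA * (a + b * THETA), s * ALPHA * (c + d * THETA)),
        (s * 1j * ALPHA_BAR * (c + d * THETA_BAR),
         s * ALPHA_BAR * (a + b * THETA_BAR)),
    )


def det2(m):
    """Determinant of a 2x2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def min_sq_det(symbols):
    """Smallest |det X|^2 over all nonzero (a, b, c, d) drawn from `symbols`."""
    best = math.inf
    for a, b, c, d in itertools.product(symbols, repeat=4):
        if a == b == c == d == 0:
            continue  # skip the all-zero codeword
        best = min(best, abs(det2(golden_codeword(a, b, c, d))) ** 2)
    return best


# Hypothetical test alphabet: Gaussian integers with parts in {-1, 0, 1}.
SYMBOLS = [complex(re, im) for re in (-1, 0, 1) for im in (-1, 0, 1)]
```

Calling `min_sq_det(SYMBOLS)` scans 6560 nonzero symbol combinations; the single codeword with a = 1 and b = c = d = 0 already has determinant (2 + i)/5, so its squared magnitude is 1/5.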

In this paper, we use Q-QAM constellations as information symbols in (2), where $Q = 2^\eta$.
We assume the constellation is scaled to match $\mathbb{Z}[i] + (1 + i)/2$, i.e., the minimum Euclidean
distance is set to 1 and it is centered at the origin. For example, the average energy is
$E_s = 0.5, 1.5, 2.5, 5, 10.5$ for $Q = 4, 8, 16, 32, 64$. Without loss of generality, we will neglect
the translation vector $(1 + i)/2$ and assume the Q-QAM constellation is carved from $\mathbb{Z}[i]$,
using a square (or cross-shaped) bounding region $\mathcal{B}_{\mathrm{QAM}}$, typical for QAMs. For convenience
in our analysis, we will choose $\mathcal{B}_{\mathrm{QAM}}$ to be in the positive quadrant. In order to minimize the
transmitted energy of this constellation, we center it by adding a suitable translation.

Signal to noise ratio is defined as $\mathrm{SNR} = n_T E_b / N_0$, where $E_b = E_s / q$ is the energy per
bit and q denotes the number of information bits per symbol. We have $N_0 = 2\sigma^2$, where $\sigma^2$
is the noise variance per real dimension, which can be adjusted as $\sigma^2 = (n_T E_b / 2)\, 10^{-\mathrm{SNR}/10}$.
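
The energy values and the noise calibration above can be reproduced with a few lines of Python. This is a minimal sketch that covers only rectangular constellations (4-, 8-, 16- and 64-QAM, with the 8-QAM taken as 4-PAM x 2-PAM); the cross-shaped 32-QAM is not handled. The function names are hypothetical, and the SNR is assumed to be given in dB.

```python
def pam_levels(m):
    """m-PAM levels with unit spacing, centred at the origin."""
    return [k - (m - 1) / 2 for k in range(m)]


def rect_qam_energy(m_i, m_q):
    """Average energy of the m_i x m_q rectangular QAM with minimum distance 1."""
    points = [(x, y) for x in pam_levels(m_i) for y in pam_levels(m_q)]
    return sum(x * x + y * y for x, y in points) / len(points)


def noise_variance(snr_db, e_s, q, n_t=2):
    """sigma^2 per real dimension for a target SNR = n_T * E_b / N_0 in dB."""
    e_b = e_s / q                      # energy per information bit
    return (n_t * e_b / 2) * 10 ** (-snr_db / 10)
```

For instance, `rect_qam_energy(4, 4)` gives the 16-QAM energy 2.5 and `rect_qam_energy(4, 2)` gives the 8-QAM energy 1.5 quoted in the text.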

Assuming that a codeword X is transmitted, the maximum-likelihood receiver might
decide erroneously in favor of another codeword $\hat{X}$. Let r denote the rank of the codeword
difference matrix $X - \hat{X}$. Since the Golden code is a full rank code, we have $r = n_T = 2$.

Let $\lambda_j$, $j = 1, \ldots, r$, be the eigenvalues of the codeword distance matrix
$A = (X - \hat{X})(X - \hat{X})^\dagger$. Let $\Delta = \prod_{j=1}^{r} \lambda_j$ be the determinant of the codeword distance matrix A and
$\Delta_{\min}$ be the corresponding minimum determinant, which is defined as

$$\Delta_{\min} = \min_{X \neq \hat{X}} \det(A). \qquad (3)$$

The pairwise error probability (PWEP) is upper bounded by

$$P(X \to \hat{X}) \leq (\Delta_{\min})^{-n_R} \left(\frac{E_s}{N_0}\right)^{-n_T n_R} \qquad (4)$$

where $n_T n_R$ is the diversity gain and $(\Delta_{\min})^{1/n_T}$ is the coding gain [1]. In the case of linear
codes analyzed in this paper, we can simply consider the all-zero codeword matrix and we
have

$$\Delta_{\min} = \min_{X \neq 0_{2 \times 2L}} \det(XX^\dagger). \qquad (5)$$

In order to compare two coding schemes for the $n_T \times n_R$ MIMO system, supporting
the same information bit rate, but different minimum determinants ($\Delta_{\min,1}$ and $\Delta_{\min,2}$) and
different constellation energies ($E_{s,1}$ and $E_{s,2}$), we define the asymptotic coding gain as

$$\gamma_{as} = \frac{\sqrt{\Delta_{\min,1}} / E_{s,1}}{\sqrt{\Delta_{\min,2}} / E_{s,2}}. \qquad (6)$$

We will only consider the case with $n_R = 2$, which makes it possible to exploit the full power of the
Golden code with the minimum number of receive antennas. Adding extra receive antennas
can increase the receiver diversity and hence performance at the cost of higher complexity.
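
The asymptotic coding gain in (6) is a simple ratio, and a small helper makes it easy to evaluate for candidate schemes. The sketch below is only an illustration; the function names and the example values (a scheme with four times the minimum determinant at energy 2.5, compared with a reference at energy 1.5) are assumptions chosen for the example.

```python
import math


def asymptotic_coding_gain(delta_min_1, e_s1, delta_min_2, e_s2):
    """gamma_as of (6): ratio of sqrt(minimum determinant) / energy for two schemes."""
    return (math.sqrt(delta_min_1) / e_s1) / (math.sqrt(delta_min_2) / e_s2)


def to_db(x):
    """Convert a linear gain to decibels."""
    return 10 * math.log10(x)


# Example: scheme 1 has 4x the minimum determinant of scheme 2 but needs more energy.
gain = asymptotic_coding_gain(4 * 0.2, 2.5, 0.2, 1.5)
gain_db = to_db(gain)
```

With these inputs the linear gain is 2 x 1.5 / 2.5 = 1.2, which is roughly 0.8 dB.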

Performance of both uncoded Golden code (Case 1) and Golden subcode (Case 2) systems
can be analyzed for L = 1. The Golden code $\mathcal{G}$ has full rate, full rank, and the minimum
determinant is $\delta_{\min} = \frac{1}{5}$ [9]; thus, for Case 1, $\Delta_{\min} = \delta_{\min}$. For Case 2, a linear subcode of $\mathcal{G}$ is
selected such that $\Delta_{\min} > 1/5$. For GST-TCM (Case 3) we consider L > 1 and the minimum
determinant can be written as

$$\Delta_{\min} = \min_{X \neq 0_{2 \times 2L}} \det(XX^\dagger) = \min_{X \neq 0_{2 \times 2L}} \det\left(\sum_{t=1}^{L} X_t X_t^\dagger\right). \qquad (7)$$



A code design criterion attempting to maximize $\Delta_{\min}$ is hard to exploit, due to the
non-additive nature of the determinant metric in (7). Since $X_t X_t^\dagger$ are positive definite matrices,
we use the following determinant inequality [22]:

$$\Delta_{\min} \geq \min_{X \neq 0_{2 \times 2L}} \sum_{t=1}^{L} \det(X_t X_t^\dagger) = \Delta'_{\min}. \qquad (8)$$



The lower bound $\Delta'_{\min}$ will be adopted as the guideline of our concatenated scheme design. In
particular we will design trellis codes that attempt to maximize $\Delta'_{\min}$, by using set partitioning
to increase the number and the magnitude of the nonzero terms $\det(X_t X_t^\dagger)$ in (8).

Note that our design criterion is based on the optimization of an upper bound to the
upper bound on the worst case pairwise error probability in (4). Nevertheless, simulation
results show that the codes with the largest $\Delta'_{\min}$ always performed better.
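
The determinant inequality behind (8) can be checked numerically on random data. The following sketch uses plain Python complex arithmetic on 2x2 blocks; the helper names and the choice of Gaussian random blocks are assumptions for illustration and are not part of the original design procedure.

```python
import random


def gram(x):
    """Return X X^dagger for a 2x2 complex matrix X given as nested tuples."""
    (a, b), (c, d) = x
    return (
        (a * a.conjugate() + b * b.conjugate(), a * c.conjugate() + b * d.conjugate()),
        (c * a.conjugate() + d * b.conjugate(), c * c.conjugate() + d * d.conjugate()),
    )


def det2(m):
    """Determinant of a 2x2 Hermitian matrix (real by construction)."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real


def add(m, n):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(p + q for p, q in zip(r, s)) for r, s in zip(m, n))


def bound_gap(blocks):
    """det(sum_t X_t X_t^dagger) - sum_t det(X_t X_t^dagger); nonnegative by (8)."""
    grams = [gram(x) for x in blocks]
    total = grams[0]
    for g in grams[1:]:
        total = add(total, g)
    return det2(total) - sum(det2(g) for g in grams)


def random_block(rng):
    """A random 2x2 complex block with i.i.d. Gaussian entries."""
    z = lambda: complex(rng.gauss(0, 1), rng.gauss(0, 1))
    return ((z(), z()), (z(), z()))
```

Evaluating `bound_gap` on any list of blocks returns the slack between the two sides of (8), which should never be negative beyond rounding error.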



3 Uncoded Golden code and its subcodes 

In both Case 1 and Case 2, the symbols $X_t$ are transmitted independently in each time slot
$t = 1, \ldots, L$. The subscript t will be omitted for brevity. We recall below the fundamental
properties of the Golden code deriving from its algebraic structure [9].

• Full-rank: the cyclic division algebra structure guarantees that all the codewords have
full rank (i.e., nonzero determinant).

• Full-rate: the spectral efficiency is of two Q-QAM information symbols per channel
use (i.e., $2\log_2(Q)$ bits/s/Hz) and saturates the two degrees of freedom of the 2x2
MIMO system.

• Cubic shaping: this relates to the cubic shape of the vectorized eight-dimensional
constellation and guarantees that no shaping loss is induced by the code.

• Non-vanishing determinant for increasing Q-QAM size: this property is derived from
the discrete nature of the infinite Golden code.

• Minimum determinant $\delta_{\min} = 1/5$: this preserves the coding gain for any Q-QAM size.

• Achieves the diversity-multiplexing gain frontier for 2TX-2RX antennas [10].

These particular properties of the Golden code are the key to its performance improvement 
over all previously proposed codes. The NVD property is especially useful for adaptive 
modulation schemes or whenever we need to expand the constellation to compensate for a 
rate loss caused by an outer code, as in TCM. 

3.1 Uncoded Golden code 

At any time t, the received signal matrix $Y = (y_{ij}) \in \mathbb{C}^{2 \times 2}$ can be written as

$$Y = HX + Z, \qquad (9)$$

where $H = (h_{ij})$ is the channel matrix, $X = (x_{ij})$ the transmitted signal matrix and $Z = (z_{ij})$
the noise matrix. Vectorizing and separating real and imaginary parts in (9) yields

$$\mathbf{y} = \mathcal{H}\mathbf{x} + \mathbf{z}, \qquad (10)$$

where $\mathcal{H}$ is given in (14) and

$$\mathbf{y} = [\Re(y_{11}), \Im(y_{11}), \Re(y_{21}), \Im(y_{21}), \Re(y_{12}), \Im(y_{12}), \Re(y_{22}), \Im(y_{22})]^T \qquad (11)$$

$$\mathbf{z} = [\Re(z_{11}), \Im(z_{11}), \Re(z_{21}), \Im(z_{21}), \Re(z_{12}), \Im(z_{12}), \Re(z_{22}), \Im(z_{22})]^T \qquad (12)$$

$$\mathbf{x} = [\Re(x_{11}), \Im(x_{11}), \Re(x_{21}), \Im(x_{21}), \Re(x_{12}), \Im(x_{12}), \Re(x_{22}), \Im(x_{22})]^T \qquad (13)$$



Lattice decoding is employed to find x such that

(15)

(16)

is a rotation matrix preserving the shape of the QAM information symbols a, b, c, d. For this
reason we will identify the Golden code with the rotated lattice $R\mathbb{Z}^8 = \{\mathbf{x} = R\mathbf{u}\}$ where

$$\mathbf{u} = [\Re(a), \Im(a), \Re(b), \Im(b), \Re(c), \Im(c), \Re(d), \Im(d)]^T. \qquad (17)$$

The lattice decoding problem can be rewritten as

$$\min \|\mathbf{y} - \mathcal{H}R\mathbf{u}\|^2$$

Level   Golden subcode                  Lattice           Binary code             Minimum determinant
0       $\mathcal{G}$                   $\mathbb{Z}^8$    $C_0 = (8,8,1)$         $\delta_{\min}$
1       $\mathcal{G}_1$                 $D_4^2$           $C_1 = (8,6,2)$         $2\delta_{\min}$
2       $\mathcal{G}_2$                 $E_8$             $C_2 = (8,4,4)$         $4\delta_{\min}$
3       $\mathcal{G}_3$                 $L_8$             $C_3 = (8,2,4)$         $8\delta_{\min}$
4       $\mathcal{G}_4 = 2\mathcal{G}$  $2\mathbb{Z}^8$   $C_4 = (8,0,\infty)$    $16\delta_{\min}$

Table 1: The Golden code partition chain with corresponding lattices, binary codes, and
minimum determinants.



3.2 Golden subcodes 

Let us consider a subcode $\mathcal{G}_1$ obtained as a right principal ideal of the Golden code $\mathcal{G}$ [18]. In
particular we consider the subcode $\mathcal{G}_1 = \{XB, X \in \mathcal{G}\}$, where



B 



i(l 
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1-6 
id 



(19) 



Since B has determinant $1 + i$, the minimum determinant of $\mathcal{G}_1$ will be $2\delta_{\min}$.
Similarly, we consider the subcodes $\mathcal{G}_k \subset \mathcal{G}$ for $k = 1, \ldots, 4$, defined as

$$\mathcal{G}_k = \{XB^k, X \in \mathcal{G}\}, \qquad (20)$$

which provide the minimum determinant $2^k \delta_{\min}$ (see Table 1).

In the previous section we have seen how the Golden codewords correspond to the rotated
$\mathbb{Z}^8$ lattice points. Neglecting the rotation matrix R, we can define an isomorphism between
$\mathcal{G}$ and $\mathbb{Z}^8$. All the subcodes of $\mathcal{G}$ correspond to particular sublattices of $\mathbb{Z}^8$ which are listed in
Table 1. In particular, it can be shown that the codewords of $\mathcal{G}_2$, when vectorized, correspond
to Gosset lattice points $E_8$ (see Appendix I). Similarly, we find that $\mathcal{G}_1$ corresponds to the
lattice $D_4^2$ (the direct sum of two four-dimensional Schläfli lattices) and $\mathcal{G}_3$ corresponds to an
eight-dimensional lattice that is denoted by $L_8$. Finally, since $B^4 = 2I_2$, we get the scaled
Golden code $2\mathcal{G}$ corresponding to $2\mathbb{Z}^8$.

Appendix II provides a simple overview of two basic techniques, which will play a key
role in the rest of the paper: Construction A for lattices [21] and lattice set partitioning by coset
codes [15, 16].



As described in Appendix II, since the subcodes of $\mathcal{G}$ are nested, the corresponding lattices
form the following lattice partition chain

$$\mathbb{Z}^8 \supset D_4^2 \supset E_8 \supset L_8 \supset 2\mathbb{Z}^8. \qquad (21)$$

Any two consecutive lattices $\Lambda_k \supset \Lambda_{k+1}$ in this chain form a four way partition, i.e., the
quotient group $\Lambda_k / \Lambda_{k+1}$ has order 4. Let $[\Lambda_k / \Lambda_{k+1}]$ denote the set of coset leaders of the
quotient group $\Lambda_k / \Lambda_{k+1}$.

The lattices in the partition chain can be obtained by Construction A, using the nested
sequence of linear binary codes $C_k$ listed in Table 1, where $C_0$ is the universe code, $C_2$ is the
extended Hamming code or Reed-Muller code RM(1,3), $C_3$ is a subcode of $C_2$, $C_1$ is the dual
of $C_3$ and $C_4$ is the empty code with only the all-zero codeword [23]. The generator matrices
$G_k$ of the codes $C_k$ are given by

Looking at $G_1$ we can see that $C_1$ is the direct sum of two parity check codes (4,3,2); this
explains why it yields the lattice $D_4^2$ by using Construction A. Similarly, since $C_3$ is the direct
sum of two repetition codes (4,1,4), we can get some insight about the structure of the lattice
$L_8$.
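Following the track of [14-16], we consider a partition tree of the Golden code of depth $\ell$.
From a nested subcode sequence $\mathcal{G} \supseteq \mathcal{G}_{\ell_0} \supset \mathcal{G}_{\ell_0+1} \supset \cdots \supset \mathcal{G}_{\ell_0+\ell}$, we have the corresponding
lattice partition chain $\mathbb{Z}^8 \supseteq \Lambda_{\ell_0} \supset \Lambda_{\ell_0+1} \supset \cdots \supset \Lambda_{\ell_0+\ell}$ where

$$\Lambda_{\ell_0} = \Lambda_{\ell_0+1} + [\Lambda_{\ell_0}/\Lambda_{\ell_0+1}] = \cdots$$
$$= \Lambda_{\ell_0+\ell} + [\Lambda_{\ell_0}/\Lambda_{\ell_0+1}] + \cdots + [\Lambda_{\ell_0+\ell-1}/\Lambda_{\ell_0+\ell}]$$
$$= \Lambda_{\ell_0+\ell} + [C_{\ell_0}/C_{\ell_0+1}] + \cdots + [C_{\ell_0+\ell-1}/C_{\ell_0+\ell}]$$

This results in a four way partition tree of depth $\ell$. Fig. 1 shows an example for $\ell = 2$.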

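The coset leaders in $[C_k / C_{k+1}]$ form a group of order 4 isomorphic to the group
$\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$, which is generated by two binary generating vectors $\mathbf{h}_1$ and $\mathbf{h}_2$, i.e.,

$$[C_k / C_{k+1}] = \{b_1 \mathbf{h}_1 + b_2 \mathbf{h}_2 \mid b_1, b_2 \in GF(2)\}$$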

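If we consider all the lattices in (21) and the corresponding sequence of nested codes $C_k$, we
have the following quotient codes:

$[C_0/C_1]$: $\mathbf{h}_1^{(0)} = (0,0,0,0,0,0,0,1)$, $\mathbf{h}_2^{(0)} = (0,0,0,1,0,0,0,0)$
$[C_1/C_2]$: $\mathbf{h}_1^{(1)} = (0,0,0,0,0,1,0,1)$, $\mathbf{h}_2^{(1)} = (0,0,0,0,0,0,1,1)$
$[C_2/C_3]$: $\mathbf{h}_1^{(2)} = (0,1,0,1,0,1,0,1)$, $\mathbf{h}_2^{(2)} = (0,0,1,1,0,0,1,1)$
$[C_3/C_4]$: $\mathbf{h}_1^{(3)} = (0,0,0,0,1,1,1,1)$, $\mathbf{h}_2^{(3)} = (1,1,1,1,1,1,1,1)$    (22)

Note that in order to generate any quotient code $[C_{\ell_0}/C_{\ell_0+\ell}]$, we stack the above vectors in
the generator matrix $H(\ell_0, \ell_0 + \ell)$ defined as

$$H(\ell_0, \ell_0+\ell) = \begin{bmatrix} \mathbf{h}_1^{(\ell_0)} \\ \mathbf{h}_2^{(\ell_0)} \\ \vdots \\ \mathbf{h}_1^{(\ell_0+\ell-1)} \\ \mathbf{h}_2^{(\ell_0+\ell-1)} \end{bmatrix} \qquad (23)$$

so we can write

$$[C_{\ell_0}/C_{\ell_0+\ell}] = \{(b_{2\ell_0}, b_{2\ell_0+1}, \ldots, b_{2\ell_0+2\ell-2}, b_{2\ell_0+2\ell-1}) H(\ell_0, \ell_0+\ell) \mid b_k \in GF(2)\}. \qquad (24)$$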

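For example, to generate $[C_0/C_2]$ we use the four generators to get the 16 coset leaders as

$$[C_0/C_2] = \left\{ (b_0, b_1, b_2, b_3) H(0,2) \;\middle|\; b_k \in GF(2),\; H(0,2) = \begin{bmatrix} \mathbf{h}_1^{(0)} \\ \mathbf{h}_2^{(0)} \\ \mathbf{h}_1^{(1)} \\ \mathbf{h}_2^{(1)} \end{bmatrix} \right\} \qquad (25)$$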
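
The stacking rule in (23)-(25) is easy to exercise in code. The sketch below assumes the eight generating vectors of (22) are read in the order $\mathbf{h}_1^{(0)}, \mathbf{h}_2^{(0)}, \ldots, \mathbf{h}_1^{(3)}, \mathbf{h}_2^{(3)}$; the dictionary `H` and the function names are hypothetical and serve only this illustration.

```python
from itertools import product

# Generating vectors of (22): H[k] = (h1^(k), h2^(k)) generates [C_k / C_{k+1}].
H = {
    0: ((0, 0, 0, 0, 0, 0, 0, 1), (0, 0, 0, 1, 0, 0, 0, 0)),
    1: ((0, 0, 0, 0, 0, 1, 0, 1), (0, 0, 0, 0, 0, 0, 1, 1)),
    2: ((0, 1, 0, 1, 0, 1, 0, 1), (0, 0, 1, 1, 0, 0, 1, 1)),
    3: ((0, 0, 0, 0, 1, 1, 1, 1), (1, 1, 1, 1, 1, 1, 1, 1)),
}


def generator_matrix(l0, l):
    """Stack the rows h1^(k), h2^(k) for k = l0, ..., l0 + l - 1, as in (23)."""
    return [row for k in range(l0, l0 + l) for row in H[k]]


def coset_leaders(l0, l):
    """All binary combinations of the rows of H(l0, l0 + l), as in (24)."""
    rows = generator_matrix(l0, l)
    leaders = set()
    for bits in product((0, 1), repeat=len(rows)):
        word = tuple(
            sum(b * r[i] for b, r in zip(bits, rows)) % 2 for i in range(8)
        )
        leaders.add(word)
    return leaders
```

For example, `coset_leaders(0, 2)` enumerates the 16 coset leaders of $[C_0/C_2]$, and `coset_leaders(2, 2)` enumerates the 16 words spanned by the last four generators.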
Note that since the $C_2 = (8,4,4)$ code is self-dual, i.e., $C_2 = C_2^\perp$ [23], we have

$$H(2,4) = \begin{bmatrix} \mathbf{h}_1^{(2)} \\ \mathbf{h}_2^{(2)} \\ \mathbf{h}_1^{(3)} \\ \mathbf{h}_2^{(3)} \end{bmatrix} = G_2.$$



3.3 Encoding and decoding the Golden subcodes 

In this section, we first show how to carve a cubic shaped finite constellation from the infinite 
lattices corresponding to the Golden subcodes. Construction A (Appendix II) is the design 
tool that also simplifies bit labeling for such a finite constellation. We then discuss the relation 
between rate and average energy required to transmit the constellation points. Finally, we 
analyze the decoding of the finite constellation. 

We consider the sublattice $\Lambda_k \subseteq \mathbb{Z}^8$ at level k in the partition chain and the
eight-dimensional bounding region $\mathcal{B} = \mathcal{B}_{\mathrm{QAM}}^4$, the four-fold Cartesian product of the bounding
region of the Q-QAM symbols. For example, using square QAM constellations, we have an
eight-dimensional hypercube as bounding region.

Using Construction A, a constellation point $\mathbf{x} \in \Lambda_k \cap \mathcal{B}$ can be written as

$$\mathbf{x} = 2\mathbf{u} + \mathbf{c} \qquad (26)$$

where $\mathbf{u} = (u_0, \ldots, u_7)$ is an 8-dimensional vector with integer components and $\mathbf{c} = (c_0, \ldots, c_7)$
is a binary codeword of the corresponding code $C_k$. With an abuse of notation we have lifted
the binary components $c_i \in GF(2)$ to integers.

Each pair of components $(2u_{2i}, 2u_{2i+1})$ is in $\mathcal{B}_{\mathrm{QAM}}$, $i = 0, 1, 2, 3$. Note that there are
only $Q/4 = 2^{\eta-2}$ distinct points from the Q-QAM that correspond to pairs of components
$(2u_{2i}, 2u_{2i+1}) \in \mathcal{B}_{\mathrm{QAM}}$. Since the components $c_i$ are either 0 or 1, we are guaranteed that
$(x_{2i}, x_{2i+1}) \in \mathcal{B}_{\mathrm{QAM}}$ and $\mathbf{x} \in \mathcal{B}$.

We are now able to define the bit labels for the finite constellation as follows. We use
$q_2 = 8 - 2k$ bits to label the $2^{q_2}$ codewords of $C_k$, through the generator matrix $G_k$, and
$q_3 = 4(\eta - 2)$ bits to label the $2\mathbf{u} \in \mathcal{B}$.

As an example, the $E_8$ encoder structure is shown in Fig. 2. Assuming 16-QAM symbols
($\eta = 4$), we use $q_3 = 8$ bits to label the $2\mathbb{Z}^8 \cap \mathcal{B}$ points and $q_2 = 4$ bits to select one of the
codewords of $C_2$ as

$$\mathbf{c} = (b_1, b_2, b_3, b_4) G_2. \qquad (27)$$

Note that there are 16 possible codewords of $C_2$.

We observe that the constellation $\Lambda_k \cap \mathcal{B}$ requires higher energy to transmit the same
number of bits as the uncoded Golden code constellation $\mathbb{Z}^8 \cap \mathcal{B}'$, since $\mathcal{B}' \subset \mathcal{B}$. In particular
we have that $\mathrm{vol}(\mathcal{B}')/\mathrm{vol}(\mathcal{B}) = N_c$, the index of the sublattice $\Lambda_k$ over $2\mathbb{Z}^8$.

For example, encoding 12 bits with $E_8$ requires the average energy of the 16-QAM ($E_{s,1} =
2.5$), while encoding the same number of bits with the uncoded Golden code only requires the
average energy of an 8-QAM ($E_{s,2} = 1.5$). Similarly, using 128-QAM ($E_{s,1} = 20.5$) we encode
24 bits with the $E_8$ lattice constellation, while with an uncoded Golden code constellation
we can use 64-QAM with half the energy requirements ($E_{s,2} = 10.5$).

Let us consider the decoding problem for the $\Lambda_k \cap \mathcal{B}$ finite constellation. Sphere decoding
of finite constellations requires high additional complexity to handle the boundary control
problem, when the constellation does not have a cubic shape [11]. In order to avoid this
problem we adopt the following strategy.

Given the received point $\mathbf{y}$, the lattice decoder first minimizes the $N_c = |\Lambda_k / 2\mathbb{Z}^8|$ squared
Euclidean distances in each coset

$$d_j^2 = \min_{\mathbf{u}^{(j)} \in \mathbb{Z}^8} \|\mathbf{y}^{(j)} - 2\mathcal{H}R\mathbf{u}^{(j)}\|^2, \quad j = 1, \ldots, N_c \qquad (28)$$

where $\mathbf{y}^{(j)} = \mathbf{y} - \mathcal{H}R\mathbf{c}^{(j)}$, $j = 1, \ldots, N_c$, then makes the final decision as

$$\hat{\mathbf{u}} = \arg\min_j (d_j^2). \qquad (29)$$

Even if we perform $N_c$ sphere decoding operations, this strategy is rather efficient, since each
decoder is working on $2\mathbb{Z}^8$ and visits on average an extremely low number of lattice points
during the search. In fact, this is equivalent to working on the lattice $\mathbb{Z}^8$ at a much higher
signal-to-noise ratio.

3.4 Performance of the Golden subcodes 

In order to compensate for the rate loss of any subcode, a constellation expansion is required,
as noted in the previous section. For large QAM constellations, it can be seen that energy
increases approximately by a factor of $\sqrt{2}$ (1.5 dB) from one partition level to the next. Since
the minimum determinant doubles at each partition level, we conclude that the asymptotic
coding gain (6) is 1 (0 dB). However, for small constellations, the energy does not double and
some gain still appears.

To illustrate the observations, we show the performance of $\mathcal{G}$ and $\mathcal{G}_2$ in Figs. 3 and 4,
corresponding to different spectral efficiencies. In Fig. 3, we show the performance of $\mathcal{G}$
with 64-QAM symbols (4 x 6 = 24 bits per codeword) and $\mathcal{G}_2$ with 128-QAM symbols
(4 x (7 - 2) + 4 = 24 bits per codeword), corresponding to a spectral efficiency of 12 bpcu.
We can see that both codes have approximately the same codeword error rate (CER). This
agrees with the expected asymptotic coding gain

$$\gamma_{as} = \frac{\sqrt{4\delta_{\min}} / 20.5}{\sqrt{\delta_{\min}} / 10.5} = 1.02 \;\rightarrow\; 0.1 \text{ dB}.$$

Fig. 4 compares the performance of $\mathcal{G}$ with 8-QAM symbols (4 x 3 = 12 bits per codeword)
and $\mathcal{G}_2$ with 16-QAM symbols (4 x (4 - 2) + 4 = 12 bits per codeword), corresponding to
the spectral efficiency of 6 bpcu. We can see that $\mathcal{G}_2$ outperforms $\mathcal{G}$ by 0.7 dB at a CER of
$10^{-3}$, in line with the expected asymptotic coding gain

$$\gamma_{as} = \frac{\sqrt{4\delta_{\min}} / 2.5}{\sqrt{\delta_{\min}} / 1.5} = 1.2 \;\rightarrow\; 0.8 \text{ dB}.$$

This small gap is essentially due to the higher energy of the 8-QAM$^1$, for which
$E_{s,2} = 1.5 > 2.5/\sqrt{2}$.

It is interesting to note that the $E_8$ constellation is the densest sphere packing in dimension
8. This implies that $\mathcal{G}_2$ maximizes

$$\min_{X \in \mathcal{G}_2, X \neq 0} \mathrm{Tr}(XX^\dagger) = \min_{X \in \mathcal{G}_2, X \neq 0} \|X\|_F^2$$

among all subcodes of the Golden code. Code design based on this parameter is known as a
trace or Euclidean distance design criterion [19, Sec. 10.9.3]. Our result shows how this design
criterion becomes irrelevant even at low SNR, when using the Golden code as a starting point.

4 Trellis Coded Modulation 

In this section we show how a trellis code can be used as an outer code encoding across
the Golden code inner symbols $X_t$, $t = 1, \ldots, L$. We analyze the systematic design problem
of this concatenated scheme by using Ungerboeck style set partitioning rules for coset codes
[14-16]. The design criterion for the trellis code is developed in order to maximize $\Delta'_{\min}$, since
this results in the maximum lower bound on the asymptotic coding gain of the GST-TCM
over the uncoded system

$$\gamma_{as} \geq \frac{\sqrt{\Delta'_{\min}} / E_{s,1}}{\sqrt{\delta_{\min}} / E_{s,2}} = \gamma'_{as}. \qquad (30)$$

We note that the asymptotic coding gain gives only a rough estimate of the actual coding gain.
Nevertheless, it is currently the only means to obtain a tractable design rule for space-time
TCM schemes [1]. We then show several examples of the above schemes with different rates
and decoding complexity. We compare the performance of such schemes with the uncoded
Golden code case.

1 This is the Cartesian product of a 4-PAM and a 2-PAM constellation.

4.1 Design criteria for GST-TCM 

Encoder structure - In a standard TCM encoder the trellis encoder output is used to label 
the signal subset, while the uncoded bits select the signals within the subset and yield the 
so called parallel transitions in the trellis [19]. Fig. 5 shows the encoder structure of the 
proposed concatenated scheme. The input bits feed two encoders, an upper trellis encoder 
and a lower sublattice encoder. The output of the trellis encoder is used to select the coset, 
while the sublattice encoder will select the point within the coset. The trellis will have parallel 
transitions on each branch corresponding to the constellation points within the same coset. 

We consider two lattices $\Lambda_{\ell_0}$ and $\Lambda_{\ell_0+\ell}$ from the lattice partition chain in Table 1, such
that $\Lambda_{\ell_0+\ell}$ is a proper sublattice of the lattice $\Lambda_{\ell_0}$, where $\ell$ denotes the relative partition level
of $\Lambda_{\ell_0+\ell}$ with respect to $\Lambda_{\ell_0}$. Let $\ell_0$ denote the absolute partition level of the lattice $\Lambda_{\ell_0}$. For
example, with $\ell_0 = 0, \ell = 2$, we have $\Lambda_{\ell_0} = \mathbb{Z}^8$ and $\Lambda_{\ell_0+\ell} = E_8$; with $\ell_0 = 2, \ell = 2$, we have
$\Lambda_{\ell_0} = E_8$ and $\Lambda_{\ell_0+\ell} = 2\mathbb{Z}^8$.

The quotient group $\Lambda_{\ell_0}/\Lambda_{\ell_0+\ell}$ has order

$$N_c = |\Lambda_{\ell_0}/\Lambda_{\ell_0+\ell}| = 4^\ell \qquad (31)$$

which corresponds to the total number of cosets of the sublattice $\Lambda_{\ell_0+\ell}$ in the lattice $\Lambda_{\ell_0}$.

Let us consider a trellis encoder operating on $q_1$ information bits. Given the relative
partition depth $\ell$, we need to select $N_c = 2^{2\ell}$ distinct cosets. If we consider a trellis code with
rate $R_c = 1/\ell$, the trellis encoder must output

$$n_c = q_1 / R_c = 2\ell = \log_2(N_c) \text{ bits},$$

hence we can input $q_1 = 2$ bits. Since the trellis has $2^{q_1}$ incoming and outgoing branches
from each state, this choice is made to preserve a reasonable trellis branch complexity. The
previous design, proposed in [18], had a much larger branch complexity.
The $n_c$ bits are used by the coset mapper to label the coset leader $\mathbf{c}_1 \in [C_{\ell_0}/C_{\ell_0+\ell}] \simeq
[\Lambda_{\ell_0}/\Lambda_{\ell_0+\ell}]$. The mapping is obtained by the product of the $n_c$ bit vector with a binary coset
leader generator matrix

$$H_1 = \begin{bmatrix} \mathbf{h}_1^{(\ell_0)} \\ \mathbf{h}_2^{(\ell_0)} \\ \vdots \\ \mathbf{h}_1^{(\ell_0+\ell-1)} \\ \mathbf{h}_2^{(\ell_0+\ell-1)} \end{bmatrix} \qquad (32)$$

where the rows are taken from (22).

We assume that we have a total of $4q = q_1 + q_2 + q_3$ input information bits. The lower
encoder is a sublattice encoder for $\Lambda_{\ell_0+\ell}$ and operates on the remaining $q_2 + q_3$ information
bits. The $q_2 = 2 \times (4 - \ell - \ell_0)$ bits label the cosets of $2\mathbb{Z}^8$ in $\Lambda_{\ell_0+\ell}$ by multiplying the following
binary generator matrix

$$H_2 = \begin{bmatrix} \mathbf{h}_1^{(\ell_0+\ell)} \\ \mathbf{h}_2^{(\ell_0+\ell)} \\ \vdots \\ \mathbf{h}_1^{(3)} \\ \mathbf{h}_2^{(3)} \end{bmatrix} \qquad (33)$$

which generates the coset leader $\mathbf{c}_2 \in [\Lambda_{\ell_0+\ell}/2\mathbb{Z}^8]$. We finally add both coset leaders $\mathbf{c}_1$ and
$\mathbf{c}_2$ modulo 2 to get $\mathbf{c}'$. The remaining $q_3 = 4q - q_1 - q_2$ bits go through the $2\mathbb{Z}^8$ encoder and
generate the vector $2\mathbf{u}$ as detailed in Appendix II. Finally, $2\mathbf{u}$ is added to $\mathbf{c}'$ (lifted to have integer
components) and mapped to the Golden codeword $X_t$.
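
The bit counts of the encoder can be tabulated with a small helper. This is a sketch under stated assumptions: the count $q_3 = 4(\eta - 2)$ is carried over from the $2\mathbb{Z}^8$ labeling of Section 3.3, the spectral efficiency is taken as the information bits divided by the two channel uses of one 2x2 codeword, and the function name and dictionary keys are hypothetical.

```python
def gst_tcm_bit_budget(l0, l, eta):
    """Bit counts per Golden codeword for partition levels l0, l0 + l and 2^eta-QAM."""
    if not (0 <= l0 and 1 <= l and l0 + l <= 4):
        raise ValueError("need 0 <= l0 < l0 + l <= 4")
    n_cosets = 4 ** l              # N_c in (31)
    n_c = 2 * l                    # trellis encoder output bits, log2(N_c)
    q1 = 2                         # trellis encoder input bits
    q2 = 2 * (4 - l - l0)          # bits labelling the cosets of 2Z^8 in the sublattice
    q3 = 4 * (eta - 2)             # bits labelling 2u (assumed, from Section 3.3)
    info_bits = q1 + q2 + q3
    return {
        "N_c": n_cosets, "n_c": n_c, "q1": q1, "q2": q2, "q3": q3,
        "info_bits": info_bits, "bpcu": info_bits / 2,
    }
```

For instance, `gst_tcm_bit_budget(0, 2, 4)` returns 16 cosets, 4 coded bits from the trellis encoder, and 14 information bits per codeword.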

We now focus on the structure of the trellis code to be used. We consider linear
convolutional encoders over the quaternary alphabet $\mathbb{Z}_4 = \{0, 1, 2, 3\}$ with mod 4 operations. We
assume the natural mapping between pairs of bits and $\mathbb{Z}_4$ symbols, i.e., 0 → 00, 1 → 01, 2 →
10, 3 → 11. Let $\beta \in \mathbb{Z}_4$ denote the input symbol and $\alpha_1, \ldots, \alpha_\ell \in \mathbb{Z}_4$ denote the $\ell$ output
symbols generated by the generator polynomials $g_1(D), \ldots, g_\ell(D)$ over $\mathbb{Z}_4$.

For example, Figure 6 shows a 4 state encoder with rate $R_c = 1/2$ defined by the generator
polynomials $g_1(D) = 1$ and $g_2(D) = D$. The trellis labels for outgoing and incoming branches
are listed from top to bottom. Figure 1 shows how the $N_c = 16$ cosets can be addressed through
a partition tree of depth 2.
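
A convolutional encoder over $\mathbb{Z}_4$ of the kind described above can be sketched in a few lines. The representation of each generator polynomial as a list of coefficients (lowest power of D first), the zero initial state, and the function names are assumptions made for this example.

```python
def z4_conv_encode(symbols, generators):
    """Encode Z_4 symbols; generators[j][k] is the coefficient of D^k in g_{j+1}(D)."""
    memory = max(len(g) for g in generators) - 1
    history = [0] * memory                      # beta_{t-1}, beta_{t-2}, ... (zero state)
    out = []
    for beta in symbols:
        window = [beta % 4] + history           # beta_t, beta_{t-1}, ...
        out.append(tuple(
            sum(gk * window[k] for k, gk in enumerate(g)) % 4 for g in generators
        ))
        history = window[:memory]               # shift the register
    return out


def z4_to_bits(symbol):
    """Natural mapping 0 -> 00, 1 -> 01, 2 -> 10, 3 -> 11."""
    return (symbol >> 1, symbol & 1)
```

With `generators = [[1], [0, 1]]`, i.e. $g_1(D) = 1$ and $g_2(D) = D$, the encoder state is the previous input symbol, which gives the 4 states of the example, and each input symbol produces the pair (current input, previous input).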

Labeling - Let us first consider the conventional design of the trellis labeling in a TCM
scheme. We then show how this can be directly transferred to GST-TCM. The conventional
TCM design criteria attempt to increase the minimum Euclidean distance $d_{\min}$ between
codewords in the following way.

1. Use subconstellations with a larger minimum Euclidean distance $d_{p,\min}$, known as
intra-coset distance.

2. Label the parallel branches in the trellis with the points within the same subconstellation.

3. Label the trellis branches for different states so that the partitions can increase the
inter-coset distance $d_{s,\min}$ among code sequences.

The aim of our GST-TCM design criteria is to maximize the lower bound $\Delta'_{\min}$ in (8). The
additive structure of $\Delta'_{\min}$ makes it possible to use the same strategy that is used for the Euclidean
distance in conventional TCM design. Let

$$\Delta_p = 2^{\ell_0+\ell}\, \delta_{\min} \qquad (34)$$

denote the minimum determinant on the trellis parallel transitions corresponding to the
Golden code partition $\Lambda_{\ell_0+\ell}$ of absolute level $\ell_0 + \ell$. Let

$$\Delta_s = \min_{X \neq 0_{2 \times 2L}} \sum_{t=t_o}^{t_o+L'-1} \det(X_t X_t^\dagger) \qquad (35)$$

denote the minimum determinant on the shortest simple error event, where $L'$ is the length
of the shortest simple error event diverging from the zero state at $t_o$ and merging to the zero
state at $t_1 = t_o + L'$. We can increase $\Delta_s$ in (35) either by increasing $L'$ or by increasing the
$\det(X_t X_t^\dagger)$ terms. Fig. 7 shows the possible inter coset distances contributing to (35).

Note that once $L'$ is fixed, Ungerboeck's design rules focus on the first and last term only.
The lower bound $\Delta'_{\min}$ in (8) is determined either by the parallel transition error events or
by the shortest simple error events in the trellis, i.e.,

$$\Delta'_{\min} = \min\{\Delta_p, \Delta_s\} \geq \min\left\{\Delta_p,\; \min_{X_{t_o}} \det(X_{t_o} X_{t_o}^\dagger) + \min_{X_{t_1}} \det(X_{t_1} X_{t_1}^\dagger)\right\}. \qquad (36)$$

The corresponding coding gain will be

$$\gamma_{as} = \min\{\gamma_{as}(\Delta_p), \gamma_{as}(\Delta_s)\}. \qquad (37)$$

Therefore, we can state the following: 

Design Criterion - We focus on $\Delta'_{\min}$. The incoming and outgoing branches for each state
should belong to different cosets that have the common father node as deep as possible in
the partition tree. This guarantees that simple error events in the trellis give the largest
contribution to $\Delta'_{\min}$.

In order to fully satisfy the above criterion for a given relative partition level $\ell$, the
minimum number of trellis states should be $N_c = 2^{2\ell}$. In order to reduce complexity we will
also consider trellis codes with fewer states. We will see in the following that the performance
loss of these suboptimal codes (in terms of the above design rule) is marginal since $\Delta_p$ is
dominating in (36). Nevertheless, the optimization of $\Delta_s$ yields a performance enhancement.
In fact, maximizing $\Delta_s$ has the effect of minimizing another relevant PWEP term.

Decoding - Let us analyze the decoding complexity. The decoder is structured as a
typical TCM decoder, i.e., a Viterbi algorithm using a branch metric computer. The branch
metric computer should output the distance of the received symbol from all the cosets of
$\Lambda_{\ell_0+\ell}$ in $\Lambda_{\ell_0}$. The decoding complexity depends on two parameters:

• $N_c$, the total number of distinct parallel branch metrics;

• the number of states in the trellis.

We observe that the branch metric computer can be realized either as a traditional sphere
decoder for each branch or as a single list sphere decoder which can keep track of all the cosets
at once.
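As a rough illustration of what the branch metric computer has to deliver, the following sketch finds, for every coset of $2\mathbb{Z}^n$, the closest coset point to a received vector and the corresponding squared distance. It is only a toy under strong assumptions: the channel is taken as the identity (no fading matrix), so coordinate-wise rounding stands in for the sphere decoder, and the function names `nearest_in_coset` and `branch_metrics` are hypothetical.

```python
from itertools import product

def nearest_in_coset(y, leader):
    """Closest point of the coset 2Z^n + leader to y (coordinate-wise rounding)."""
    return [2 * round((yi - ci) / 2) + ci for yi, ci in zip(y, leader)]

def branch_metrics(y, leaders):
    """Squared Euclidean distance from y to each coset 2Z^n + leader."""
    metrics = {}
    for leader in leaders:
        x = nearest_in_coset(y, leader)
        metrics[leader] = sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return metrics

# Toy case: the four cosets of 2Z^2 in Z^2, one metric per coset.
leaders = list(product((0, 1), repeat=2))
metrics = branch_metrics((0.9, 2.2), leaders)
best = min(metrics, key=metrics.get)  # coset with the smallest branch metric
```

A Viterbi decoder would then use one such metric per distinct coset label in each trellis section; with a real fading channel the rounding step has to be replaced by a lattice (sphere) decoder.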

4.2 Code Design Examples for TCM 

In this subsection, we give four examples of GST-TCM with different numbers of states using
different partitions $\Lambda_{\ell_0}/\Lambda_{\ell_0+\ell}$. We assume a frame length $L = 130$ in all examples. All related
parameters are summarized in Table 2.

The trellis code generator polynomials have been selected by an exhaustive search among
all polynomials of degree less than four with quaternary coefficients. The selection was made
in order to satisfy the design criterion (when possible) and to maximize $\Delta_{s,\min}$.

We first describe the uncoded Golden code schemes, which are used as reference systems
for performance comparison. In the standard uncoded Golden code, four Q-QAM information
symbols are sent for each codeword (2), for a total of $4q$ information bits, where $q = \log_2(Q)$.
When $q$ is not an integer, we have to consider different size QAM symbols within the same
Golden codeword, as shown in the following examples.

• 5 bpcu: A total of 10 bits must be sent in a Golden codeword: the symbols a and c are
in a 4-QAM (2 bits), while the symbols b and d are in an 8-QAM (3 bits). This guarantees
that the same average energy is transmitted from both antennas. In this case we have
$E_{s,2} = (0.5 + 1.5)/2 = 1$ and $q = 2.5$ bits.

• 6 bpcu: A total of 12 bits must be sent in a Golden codeword: the symbols a, b, c, d
are in an 8-QAM (3 bits). In this case we have $E_{s,2} = 1.5$ and $q = 3$ bits.

• 7 bpcu: A total of 14 bits must be sent in a Golden codeword: the symbols a and
c are in an 8-QAM (3 bits), while the symbols b and d are in a 16-QAM (4 bits). This
guarantees that the same average energy is transmitted from both antennas. In this
case we have $E_{s,2} = (1.5 + 2.5)/2 = 2$ and $q = 3.5$ bits.

• 10 bpcu: A total of 20 bits must be sent in a Golden codeword: the symbols a, b, c, d
are in a 32-QAM (5 bits). In this case we have $E_{s,2} = 5$ and $q = 5$ bits.
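The average energies quoted in the list above can be checked numerically. The sketch below is an illustration under assumptions that are not stated in the text: constellation points lie on a zero-centred grid with unit spacing, the 8-QAM is a 4 x 2 rectangle, and the 32-QAM is a cross constellation (a 6 x 6 grid without its four corners). With these choices the computed energies for 4-, 8-, 16-, 32- and 64-QAM are 0.5, 1.5, 2.5, 5 and 10.5, matching the values used in this section. All function names are hypothetical.

```python
def avg_energy(points):
    """Mean squared norm of a list of 2-D constellation points."""
    return sum(x * x + y * y for x, y in points) / len(points)

def half_grid(m):
    """m equally spaced, zero-centred levels with unit spacing."""
    return [k - (m - 1) / 2 for k in range(m)]

def rect_qam(mx, my):
    """Rectangular QAM with mx levels in phase and my levels in quadrature."""
    return [(x, y) for x in half_grid(mx) for y in half_grid(my)]

def cross_32qam():
    """Assumed cross shape: 6x6 grid with the four corner points removed."""
    return [(x, y) for x, y in rect_qam(6, 6)
            if not (abs(x) == 2.5 and abs(y) == 2.5)]

energy = {
    4: avg_energy(rect_qam(2, 2)),
    8: avg_energy(rect_qam(4, 2)),
    16: avg_energy(rect_qam(4, 4)),
    32: avg_energy(cross_32qam()),
    64: avg_energy(rect_qam(8, 8)),
}

# Mixed-size uncoded Golden codewords: average over the two symbol types.
es2_5bpcu = (energy[4] + energy[8]) / 2
es2_7bpcu = (energy[8] + energy[16]) / 2
```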

Example 1: We use a two level partition $E_8/2\mathbb{Z}^8$. The 4 and 16 state trellis codes
using 16-QAM ($E_{s,1} = 2.5$) gain 2.2 dB and 2.5 dB, respectively, over the uncoded Golden
code ($E_{s,2} = 1$) at the rate of 5 bpcu.

The two level partition ($\ell_0 = 2$ and $\ell = 2$) has a quotient group $E_8/2\mathbb{Z}^8$ of order $N_c = 16$.
The quaternary trellis encoders for 4 and 16 states with rate $R_c = 1/2$ have $q_1 = 2$ input
information bits and $n_c = 4$ output bits, which label the coset leaders using the generator
matrix with rows $h_1^{(2)}, h_2^{(2)}, h_1^{(3)}, h_2^{(3)}$. The trellis structures are shown in Fig. 6 and Fig. 8,
respectively. The sublattice encoder has $q_2 = 0$ and $q_3 = 8$ input bits, giving a total number
of input bits per information symbol $q = (q_1 + q_2 + q_3)/4 = 10/4 = 2.5$ bits.

In Fig. 6, for each trellis state, the four outgoing branches with labels $a_1, a_2$, corresponding
to input $\beta = 0, 1, 2, 3$, are listed on the left side of the trellis. Similarly, the four incoming trellis
branches to each state are listed on the right side of the trellis structure. In this case, $a_1$
chooses the cosets of $L_8$ in $\Lambda = E_8$ and $a_2$ chooses the cosets of $\Lambda_\ell = 2\mathbb{Z}^8$ in $L_8$.

We can observe that the four branches merging in each state belong to four different cosets
of $2\mathbb{Z}^8$ in $L_8$, since $a_1$ is constant and $a_2$ varies (see Fig. 1). This guarantees an increased
$\Delta'_{\min}$. On the other hand, the four branches departing from each state are in the cosets of $L_8$
in $E_8$. This does not give the largest possible $\Delta'_{\min}$ since $a_1$ varies. Looking for example at
the zero state, there are four outgoing branches labeled by $a_1 = 0, 1, 2, 3$ with $a_2$ fixed to
0, while the four incoming branches are labeled by $a_1 = 0$ and $a_2 = 0, 1, 2, 3$.

This results in a suboptimal design since it cannot guarantee that the outgoing trellis
paths belong to cosets that are in the deepest level ($2\mathbb{Z}^8$) of the partition tree. We can see that
the shortest simple error event has a length of $L' = 2$, corresponding to the state sequence
$0 \to 1 \to 0$ and labels 10, 01. This yields the lower bound on the asymptotic coding gain

$$\gamma'_{as} = \frac{\sqrt{\min(16\delta_{\min},\ 4\delta_{\min} + 8\delta_{\min})}/E_{s,1}}{\sqrt{\delta_{\min}}/E_{s,2}} = 1.4 \text{ dB}. \tag{38}$$

Table 2: Summary of the parameters of GST-TCM Examples

The above problem suggests the use of a 16 state encoder. In Fig. 8, we can see that the
shortest simple error event has length $L' = 3$, corresponding to the state sequence
$0 \to 1 \to 4 \to 0$ and labels 01, 10, 01. In general, we have that the first output label $a_1$ is fixed for both
outgoing and incoming states. This guarantees that both incoming and outgoing trellis branches
from each state belong to cosets with the deepest father nodes in the partition tree. This yields
the lower bound on the corresponding asymptotic coding gain

$$\gamma'_{as} = \frac{\sqrt{\min(16\delta_{\min},\ 8\delta_{\min} + 4\delta_{\min} + 8\delta_{\min})}/E_{s,1}}{\sqrt{\delta_{\min}}/E_{s,2}} = 2.0 \text{ dB}. \tag{39}$$
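The determinant terms inside the two bounds of this example follow a simple pattern, which the sketch below tries to capture. It rests on an assumption inferred from the worked examples rather than stated as a rule in the text: a branch whose coset label has its first nonzero digit at position $k$ (counting from 0) contributes $2^{\ell_0+k}\delta_{\min}$, while parallel transitions contribute $\Delta_p = 2^{\ell_0+\ell}\delta_{\min}$. The function names are hypothetical and all quantities are in units of $\delta_{\min}$.

```python
def branch_weight(label, l0):
    """Assumed contribution of one branch, in units of delta_min.

    The label is a string of digits (a_1 a_2 ...); the first nonzero digit
    at position k is taken to give a contribution of 2**(l0 + k).
    Labels on an error event are nonzero, so a nonzero digit always exists.
    """
    k = next(i for i, digit in enumerate(label) if digit != "0")
    return 2 ** (l0 + k)

def delta_bound(labels, l0):
    """min(Delta_p, sum of branch contributions), in units of delta_min."""
    ell = len(labels[0])                 # relative partition level
    delta_p = 2 ** (l0 + ell)            # parallel transitions
    delta_s = sum(branch_weight(lab, l0) for lab in labels)
    return min(delta_p, delta_s)

# Example 1 (l0 = 2): shortest error events of the 4 and 16 state codes.
four_state = delta_bound(["10", "01"], l0=2)
sixteen_state = delta_bound(["01", "10", "01"], l0=2)
```

For the 4 state code the error event term (4 + 8) is smaller than $\Delta_p$, whereas for the 16 state code the parallel transitions dominate.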

Compared to the 4 state code, the 16 state GST-TCM has a higher decoding complexity. It requires
64 lattice decoding operations in each trellis section, while the 4 state GST-TCM only requires
16 lattice decoding operations. Note that each lattice decoding operation works on $2\mathbb{Z}^8$.

Performance comparison of the proposed codes with the uncoded scheme with 5 bpcu is
shown in Fig. 9. We can observe that a simple 4 state GST-TCM outperforms the uncoded
scheme by 2.2 dB at the FER of $10^{-3}$. The 16 state GST-TCM outperforms the uncoded case
by 2.5 dB at the FER of $10^{-3}$.

Example 2: We use a two level partition $\mathbb{Z}^8/E_8$ ($\ell_0 = 0$ and $\ell = 2$). The 4 and 16 state
trellis codes using 16-QAM ($E_{s,1} = 2.5$) gain 3.0 dB and 3.3 dB, respectively, over the uncoded
Golden code ($E_{s,2} = 2$) at the rate of 7 bpcu.

As in Example 1, we can see that the 4 state trellis code is suboptimal since it cannot
guarantee that both the incoming and outgoing trellis paths belong to cosets that are in the
deepest level ($E_8$) of the partition tree. In contrast, the 16 state trellis code always has a
fixed label $a_1$ in each state. This fully satisfies the proposed design criteria. However, the 16
state code requires higher decoding complexity. Finally, we have

$$\gamma'_{as} = \frac{\sqrt{\min(4\delta_{\min},\ \delta_{\min} + 2\delta_{\min})}/E_{s,1}}{\sqrt{\delta_{\min}}/E_{s,2}} = 1.4 \text{ dB} \tag{40}$$

for the 4 state GST-TCM and

$$\gamma'_{as} = \frac{\sqrt{\min(4\delta_{\min},\ 2\delta_{\min} + \delta_{\min} + 2\delta_{\min})}/E_{s,1}}{\sqrt{\delta_{\min}}/E_{s,2}} = 2.0 \text{ dB} \tag{41}$$

for the 16 state GST-TCM.

Performance of both the proposed TCM and uncoded transmission (7 bpcu) schemes is
compared in Fig. 10. It is shown that the proposed 4 and 16 state TCMs outperform the
uncoded case by 3.0 dB and 3.3 dB at the FER of $10^{-3}$.

Compared to Example 1, this GST-TCM has a higher decoding complexity. It requires
$N_c = 256$ lattice decoding operations of $2\mathbb{Z}^8$ in each trellis section or 16 lattice decoders of
cosets of $E_8$.

Example 3: We use a three level partition $\mathbb{Z}^8/L_8$ ($\ell_0 = 0$ and $\ell = 3$). The 16 and
64 state trellis codes using 16-QAM ($E_{s,1} = 2.5$) gain 4.2 dB and 4.3 dB, respectively, over an
uncoded Golden code ($E_{s,2} = 1.5$) at the rate of 6 bpcu.

In Fig. 11, for each trellis state, the four outgoing branches with labels $a_1, a_2, a_3$, corresponding
to input $\beta = 0, 1, 2, 3$, are listed on the left side of the trellis. Similarly, the four
incoming trellis branches to each state are listed on the right side of the trellis structure. In
such a case, $a_1$ chooses the cosets of $D_4^2$ in $\Lambda = \mathbb{Z}^8$, $a_2$ chooses the cosets of $E_8$ in $D_4^2$,
and $a_3$ chooses the cosets of $\Lambda_\ell = L_8$ in $E_8$.

The four branches departing from each state belong to four different cosets of $L_8$, since
$a_1$ and $a_2$ are constant, while $a_3$ varies. On the other hand, the four branches arriving in
each state are cosets of $E_8$. This does not yield the largest possible $\Delta'_{\min}$, since only $a_1$ is
fixed but $a_2$ varies. This results in a suboptimal design since it cannot guarantee that both
incoming and outgoing trellis paths belong to cosets that are in the deepest level ($L_8$) of the
partition tree.

We can see that the shortest simple error event has a length of $L' = 3$, corresponding to
the state sequence $0 \to 1 \to 4 \to 0$ and labels 001, 100, 011. This yields the lower bound on
the corresponding asymptotic coding gain

$$\gamma'_{as} = \frac{\sqrt{\min(8\delta_{\min},\ 4\delta_{\min} + \delta_{\min} + 2\delta_{\min})}/E_{s,1}}{\sqrt{\delta_{\min}}/E_{s,2}} = 2.0 \text{ dB}. \tag{42}$$

The above problem suggests the use of a 64 state encoder. In Fig. 12, we can see that the
shortest simple error event has length $L' = 4$, corresponding to the state sequence
$0 \to 1 \to 4 \to 16 \to 0$ and labels 001, 100, 010, 001. Note that now the output labels $a_1, a_2$ are fixed
for all outgoing and incoming states. This guarantees that both incoming and outgoing trellis
branches from each state belong to the cosets that are deepest in the partition tree. This
yields the lower bound on the corresponding asymptotic coding gain

$$\gamma'_{as} = \frac{\sqrt{\min(8\delta_{\min},\ 4\delta_{\min} + \delta_{\min} + 2\delta_{\min} + 4\delta_{\min})}/E_{s,1}}{\sqrt{\delta_{\min}}/E_{s,2}} = 2.3 \text{ dB}. \tag{43}$$

Performance of the proposed codes and the uncoded scheme with 6 bpcu is compared in
Fig. 13. We can observe that a 16 state GST-TCM outperforms the uncoded scheme by 4.2
dB at the FER of $10^{-3}$. The 64 state GST-TCM outperforms the uncoded case by 4.3 dB at
the FER of $10^{-3}$.

Note that in this example with 16 states, we have the same decoding complexity as in
the previous example with 16 states.

Example 4: We use the same partition as in Example 3. The 16 and 64 state trellis
codes using 64-QAM ($E_{s,1} = 10.5$) gain 1.5 dB, in both cases, over an uncoded Golden code
($E_{s,2} = 5$) at the rate of 10 bpcu.

The trellis structures are shown in Figures 11 and 12, respectively. This yields the lower
bounds on the corresponding asymptotic coding gain

$$\gamma'_{as} = \frac{\sqrt{\min(8\delta_{\min},\ 4\delta_{\min} + \delta_{\min} + 2\delta_{\min})}/E_{s,1}}{\sqrt{\delta_{\min}}/E_{s,2}} = 1.0 \text{ dB} \tag{44}$$

for the 16 state GST-TCM and

$$\gamma'_{as} = \frac{\sqrt{\min(8\delta_{\min},\ 4\delta_{\min} + \delta_{\min} + 2\delta_{\min} + 4\delta_{\min})}/E_{s,1}}{\sqrt{\delta_{\min}}/E_{s,2}} = 1.3 \text{ dB} \tag{45}$$

for the 64 state GST-TCM.

Fig. 14 compares the performance of the above codes at the spectral efficiency of 10 bpcu with
a 64-QAM signal constellation for GST-TCM and a 32-QAM signal constellation for the uncoded
case, respectively. It is shown that a 16 state GST-TCM outperforms the uncoded scheme by
1.5 dB at the FER of $10^{-3}$. The 64 state code has similar performance to the 16 state code.

Remarks: For GST-TCM, we can see that the lower bound $\gamma'_{as}$ on $\gamma_{as}$ is only a rough
approximation of the true system performance. This is due to the following reasons:

1. $\gamma_{as}$ is based on the worst case pairwise error event, which is not always the strongly
dominant term of the full union bound in fading channels;

2. the lower bound $\gamma'_{as}$ on $\gamma_{as}$ can be loose due to the determinant inequality;

3. the multiplicity of the minimum determinant paths is not taken into account.

Looking at Table 2, we observe that the true coding gain is better approximated by a
combination of $\gamma'_{as}(\Delta_p)$ and $\gamma'_{as}(\Delta_s)$ in (37), rather than $\gamma'_{as}$.
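For reference, the lower bounds in (38) to (45) can be re-evaluated with a few lines of code. The sketch below is illustrative only: it assumes that each bound has the form $(\sqrt{\Delta}/E_{s,1})/(\sqrt{\delta_{\min}}/E_{s,2})$ expressed as $10\log_{10}(\cdot)$ dB, with the determinant terms and energies read off Examples 1 to 4 in units of $\delta_{\min}$. The helper name `gain_db` is hypothetical.

```python
import math

def gain_db(bound, es_coded, es_uncoded):
    """10*log10 of (sqrt(bound)/E_s1) / (1/E_s2); bound is in units of delta_min."""
    return 10 * math.log10(math.sqrt(bound) / es_coded * es_uncoded)

# (equation number, min-determinant bound / delta_min, E_s1, E_s2)
cases = [
    (38, min(16, 4 + 8), 2.5, 1.0),          # Example 1, 4 states
    (39, min(16, 8 + 4 + 8), 2.5, 1.0),      # Example 1, 16 states
    (40, min(4, 1 + 2), 2.5, 2.0),           # Example 2, 4 states
    (41, min(4, 2 + 1 + 2), 2.5, 2.0),       # Example 2, 16 states
    (42, min(8, 4 + 1 + 2), 2.5, 1.5),       # Example 3, 16 states
    (43, min(8, 4 + 1 + 2 + 4), 2.5, 1.5),   # Example 3, 64 states
    (44, min(8, 4 + 1 + 2), 10.5, 5.0),      # Example 4, 16 states
    (45, min(8, 4 + 1 + 2 + 4), 10.5, 5.0),  # Example 4, 64 states
]

# Gains rounded to one decimal place, keyed by equation number.
gains = {eq: round(gain_db(b, e1, e2), 1) for eq, b, e1, e2 in cases}
```

Comparing `gains` with the simulated gains quoted in each example makes the looseness of the bound discussed in the remarks above easy to see.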

5 Conclusions 

In this paper, we presented GST-TCM, a concatenated scheme for slow fading $2 \times 2$ MIMO
systems. The inner code is the Golden code and the outer code is a trellis code. Lattice
set partitioning is designed specifically to increase the minimum determinant of the Golden
codewords, which label the branches of the outer trellis code. The Viterbi algorithm is applied
in trellis decoding, where branch metrics are computed by using a lattice sphere decoder.
The general framework for GST-TCM design and optimization is based on Ungerboeck TCM
design rules.

Simulation shows that 4 and 16 state GST-TCMs achieve 3 dB and 4.2 dB performance
gains over the uncoded Golden code at the FER of $10^{-3}$ with spectral efficiencies of 7 bpcu and 6
bpcu, respectively.

Future work will explore the possibility of further code optimization, by an extensive 
search based on the determinant distance spectrum, which gives a more accurate approxima- 
tion of the true coding gain. 

Appendix I: Proof of (20) 

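Let us consider a subcode $\mathcal{G}_2$ of the Golden code $\mathcal{G}$ obtained by $\mathcal{G}_2 = \{XB^2, X \in \mathcal{G}\}$, where
$B$ is given in (19) and $X$ is given as

$$X = \begin{bmatrix} \alpha(a + b\theta) & \alpha(c + d\theta) \\ i\bar{\alpha}(c + d\bar{\theta}) & \bar{\alpha}(a + b\bar{\theta}) \end{bmatrix} \tag{46}$$

where we omit the normalization factor $\frac{1}{\sqrt{5}}$ for simplicity. After manipulations, we obtain the
subcode $\mathcal{G}_2$ codeword

$$\begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} = XB^2 \tag{47}$$

where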

0ii = [-l-i2(l + 0)]a+(-0 + i20)&+(-0 + z)c+(-l-0 + z'0)d, 

921 = [_0_i(i + 0)] o +(i + i0)&+[0_j20]c+[-l-i(2 + 20)]d, 

g 12 = [-1 -9 + i9] a + (6 - i) b + (-26 - iff) c + (-2 - 29 + i) d, 

g 22 = [-l-i2(l + 9)]a+(-9 + i29)b+{-l + 9 + i)c+(-l-9 + i9)d, 

where $a, b, c, d \in \mathbb{Z}[i]$. Note that $\bar{\theta} = 1 - \theta$ and $\theta^2 = \theta + 1$. Vectorizing (47) yields

$$\mathrm{vec}(XB^2) = \tilde{R}\,\mathbf{u} \tag{48}$$

where

$$\mathrm{vec}(XB^2) = [\Re(g_{11}), \Im(g_{11}), \Re(g_{21}), \Im(g_{21}), \Re(g_{12}), \Im(g_{12}), \Re(g_{22}), \Im(g_{22})]^T \tag{49}$$
-1 2(1 + 0) -9 -29 -9 -1 -1-9 -9 



R 



-2(1+0) -1 29 -9 1 -9 

-9 1+9 1-9 9_ 29 

-l-9_ -9_ 9_ 1 -29 9_ 

-1-9 -9 9 1 -29 9 

9 -1-9-19 -9 -29 

-1 2(1 + 0) -9 -29 -1 + 9 -1 

-2(1+0) -1 20-0 1 -l + l 



and 



-1-0 
-1 2 + 20 

-2 - 20 -1 

-2 - 20 -1 

1 -2 -20 
-1-0 -0 

9 -1-0 

r 



(50) 



$$\mathbf{u} = [\Re(a), \Im(a), \Re(b), \Im(b), \Re(c), \Im(c), \Re(d), \Im(d)]^T. \tag{51}$$

The matrix $\tilde{R}$ can be written as

$$\tilde{R} = RM.$$

Substituting the matrix $R$, defined in (16), into the above equation yields the lattice generator
matrix

"-2 1 1 0-10 
-1-2 1 0-1 
1 0-11-10-10 

1-1-10-10-1 
0-10 1-11-10 

1 0-10-1-10-1 
1 0-10-21 

-10 0-1-1-2 



M 



R J R 
By conducting LLL lattice basis reduction, we found that the lattice generator matrix $M$ has
the minimum squared Euclidean distance $d^2_{\min} = 4$. Since the determinant of $M$ is 16, the
packing density coincides with the one of $E_8$, which is the unique optimal sphere packing in
8 dimensions. Note that there exist multiple lattice generator matrices for $E_8$ lattices, all of
which have the same properties as above [21]. Therefore we conclude that the subcode $\mathcal{G}_2$ of
the Golden code $\mathcal{G}$, when vectorized, corresponds to the $E_8$ lattice points. A similar approach
can be used for the other lattices in the partition.
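The two lattice parameters used in this argument (minimum squared distance 4 and fundamental volume 16) can be cross-checked on an independent description of the scaled $E_8$ lattice. The sketch below does not use the matrix $M$ of this appendix. Instead it assumes the standard fact that Construction A applied to the (8, 4, 4) extended Hamming code gives a scaled version of $E_8$; the particular generator matrix `G` is one common choice, picked here for illustration.

```python
from itertools import product

# Systematic generator matrix of the (8,4,4) extended Hamming code (assumed choice).
G = [
    (1, 0, 0, 0, 0, 1, 1, 1),
    (0, 1, 0, 0, 1, 0, 1, 1),
    (0, 0, 1, 0, 1, 1, 0, 1),
    (0, 0, 0, 1, 1, 1, 1, 0),
]

# All 16 codewords b*G over GF(2).
codewords = set()
for bits in product((0, 1), repeat=4):
    cw = tuple(sum(b * g for b, g in zip(bits, col)) % 2 for col in zip(*G))
    codewords.add(cw)

# Enumerate lattice points x = 2u + c with u in {-1,0,1}^8, then find the
# smallest nonzero squared norm and how many points attain it.
min_norm, count = None, 0
for c in codewords:
    for u in product((-1, 0, 1), repeat=8):
        norm = sum((2 * ui + ci) ** 2 for ui, ci in zip(u, c))
        if norm == 0:
            continue
        if min_norm is None or norm < min_norm:
            min_norm, count = norm, 1
        elif norm == min_norm:
            count += 1

# Fundamental volume of 2Z^8 + C: vol(2Z^8) / |C| = 2^8 / 2^4.
volume = 2 ** 8 // len(codewords)
```

The enumeration box is large enough to contain every minimum-norm vector, so `count` is the number of nearest neighbours of the origin in this lattice.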

Appendix II: Construction A and Set Partitioning 

In this Appendix we review the basic principles of Construction A and lattice set partitioning 
by coset codes following a simple example based on the lattice Z 2 . The general theory 
underlying these techniques is described in detail in [15,16,21]. We assume that the reader 
is familiar with the basic facts of group theory, in particular we will use the notions of group, 
subgroup, quotient group, and group isomorphism [20]. 

Construction A establishes a correspondence between an integer lattice and a linear binary
code [21]. In particular, given an integer lattice $\Lambda$, we obtain all the codewords of a linear
binary code $C$ by taking all components of the lattice points mod 2; we write

$$C = \Lambda \bmod 2 \tag{52}$$

On the other hand, given a linear binary code $C = (n, k, d)$ with codewords $c_i$, we can write

$$\Lambda = 2\mathbb{Z}^n + C = \bigcup_{i} \left(2\mathbb{Z}^n + c_i\right) \tag{53}$$

This construction also provides a simple relation between the minimum Hamming distance
$d$ of the code and the minimum Euclidean distance between any two lattice points. For this
reason it can be used to design dense sphere packing lattices [21]. For our purposes we will
use Construction A as a means to handle the set partitioning and to bit-label the lattice points
within a finite constellation.

As an example, let us consider a two-dimensional integer lattice $\mathbb{Z}^2$, depicted in Fig. 15.
In such a lattice, the checkerboard lattice $D_2$ is a sublattice of $\mathbb{Z}^2$ containing all integer vectors
$(x, y)$ such that $x + y$ is even. Using the repetition code of length two $C = \{(00), (11)\}$ we
write

$$D_2 = 2\mathbb{Z}^2 + C = [2\mathbb{Z}^2 + (00)] \cup [2\mathbb{Z}^2 + (11)]$$

This is illustrated in Fig. 15, where the squares denote the $D_2$ lattice that is the union of the
$2\mathbb{Z}^2$ lattice (light squares) and its translate (dark squares).
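The identity between the two descriptions of $D_2$ is easy to confirm on a finite window. The following sketch is a small check restricted to an arbitrarily chosen box of side 9 around the origin: it builds the checkerboard points once from the parity rule and once from Construction A with the repetition code, and the two sets coincide.

```python
from itertools import product

box = range(-4, 5)  # finite window of Z^2 used for the check (arbitrary size)

# Direct definition: integer points whose coordinate sum is even.
d2_direct = {(x, y) for x, y in product(box, repeat=2) if (x + y) % 2 == 0}

# Construction A: points that reduce mod 2 to a codeword of the repetition code.
repetition_code = [(0, 0), (1, 1)]
d2_construction_a = {
    (x, y)
    for x, y in product(box, repeat=2)
    if (x % 2, y % 2) in repetition_code
}
```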

Similarly, given the universe code $C_0 = (2, 2, 1) = \{(00), (01), (10), (11)\}$, we can write

$$\mathbb{Z}^2 = 2\mathbb{Z}^2 + C_0$$

Given the linear code $C$, the dual code $C^{\perp}$ is defined such that $C \oplus C^{\perp} = C_0$, i.e., all the
binary sums of a codeword from $C$ with a codeword from $C^{\perp}$ yield all the universe codewords.
In our example, $C = \{(00), (11)\}$ has a dual code $C^{\perp} = \{(00), (01)\}$.

Linearity of the codes is related to the additive group structure and enables us to interpret
codes and subcodes as groups and subgroups. In turn, this lets us define a quotient group
between a code and its subcode.

For example, given that $C \subset C_0$, we can write the quotient group as the set of two cosets
of the subgroup $C$, i.e., $C_0/C = \{C + (00), C + (01)\}$.

A well known property of abelian groups tells us that the quotient group has itself a group
structure. The quotient group operation $\oplus$ between two cosets is defined as
$(C + c_1) \oplus (C + c_2) = C + (c_1 + c_2)$. This implies that the quotient group is isomorphic to the so called
quotient code denoted by $[C_0/C]$ and defined as the set of all the coset leaders. If $C_0$ is the
universe code then the quotient code coincides with the dual code, i.e.,

$$[C_0/C] = C^{\perp} \tag{54}$$

In our example $[C_0/C] = \{(00), (01)\}$.

Let us consider a lattice $\Lambda_0$ and a sublattice $\Lambda \subset \Lambda_0$. Thanks to the group structure of
lattices, we can define the quotient lattice $\Lambda_0/\Lambda$ as the set of all distinct translates (or cosets)
of $\Lambda$, i.e.,

$$\Lambda_0/\Lambda = \{\Lambda + \mathbf{x}_i\} \tag{55}$$

where $\mathbf{x}_i$ are the translation vectors or coset leaders. Let $[\Lambda_0/\Lambda]$ denote the set of all the
coset leaders; then we write

$$\Lambda_0 = \Lambda + [\Lambda_0/\Lambda] \tag{56}$$

If $C_0$ and $C$ are the corresponding binary codes defined by Construction A, we have the
following group isomorphism

$$C_0/C \simeq \Lambda_0/\Lambda \tag{57}$$

Note that the quotient group defines a partition of $C_0$ into $N_c = |C_0/C|$ disjoint cosets of the
same size, where $|\cdot|$ denotes the cardinality of the set. Thanks to the above isomorphism, the
index of the sublattice in the lattice is finite, i.e., $|\Lambda_0/\Lambda| = N_c$. Considering the fundamental
volume of a lattice defined as $\mathrm{vol}(\Lambda) = \det(MM^T)^{1/2}$, where $M$ is the lattice generator
matrix, we have $\mathrm{vol}(\Lambda)/\mathrm{vol}(\Lambda_0) = N_c$.
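The volume relation can be illustrated on the chain $\mathbb{Z}^2 \supset D_2 \supset 2\mathbb{Z}^2$ used in this appendix. The sketch below is a minimal check that assumes square generator matrices, for which $\det(MM^T)^{1/2}$ reduces to $|\det M|$. The basis chosen for $D_2$ is one possible choice and the helper name `volume` is hypothetical.

```python
def volume(basis):
    """Fundamental volume |det M| of a 2x2 generator matrix (rows are basis vectors)."""
    (a, b), (c, d) = basis
    return abs(a * d - b * c)

Z2 = [(1, 0), (0, 1)]
D2 = [(1, 1), (1, -1)]     # one possible basis of the checkerboard lattice
TWO_Z2 = [(2, 0), (0, 2)]

# Index of each sublattice in its parent lattice = ratio of fundamental volumes.
index_d2_in_z2 = volume(D2) // volume(Z2)
index_2z2_in_d2 = volume(TWO_Z2) // volume(D2)
index_2z2_in_z2 = volume(TWO_Z2) // volume(Z2)
```

Each step of the chain has index 2, in agreement with the two cosets per level of the binary partition, and the overall index 4 equals the number of codewords of the universe code.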

Consider the sequence of nested lattices $2\mathbb{Z}^n \subset \Lambda \subset \Lambda_0 \subset \mathbb{Z}^n$. Each coset of the quotient
lattice can be identified by a coset leader which is related to the quotient code as follows

$$[\Lambda_0/\Lambda] \bmod 2 = [C_0/C] \quad \text{and} \quad [\Lambda/2\mathbb{Z}^n] \bmod 2 = C \tag{58}$$

This is due to the fact that the lattice $2\mathbb{Z}^n$ is obtained by Construction A with the $(n, 0)$
code, containing only the all zero codeword. The partitions of the basic lattice $\Lambda_0$ can be
written as

$$\Lambda_0 = \Lambda + [\Lambda_0/\Lambda] = \Lambda + [C_0/C] \tag{59}$$

In our example, we first partition $\mathbb{Z}^2$ into two cosets: the sublattice $D_2$ and its translate
$D_2 + (01)$ (squares and circles in Fig. 15, respectively).

$$\mathbb{Z}^2 = D_2 + C^{\perp} = D_2 + [\mathbb{Z}^2/D_2]$$

The number of partitions equals the index of the sublattice $D_2$ in $\mathbb{Z}^2$, i.e.,
$N_c = |C^{\perp}| = 2$. We can further partition each coset by partitioning $D_2$ into two cosets. The
sequence of nested lattices $\mathbb{Z}^2 \supset D_2 \supset 2\mathbb{Z}^2$ induces a partition chain

$$\mathbb{Z}^2 = 2\mathbb{Z}^2 + C^{\perp} + C = 2\mathbb{Z}^2 + [\mathbb{Z}^2/D_2] + [D_2/2\mathbb{Z}^2]$$

which can be represented by the two level binary partition tree in Fig. 16.

We observe how Construction A yields a simple bit labeling of a finite constellation
$S = \Lambda \cap \mathcal{B}$ carved from the infinite lattice with shaping region $\mathcal{B}$. In particular, since $\Lambda = 2\mathbb{Z}^n + C$,
with $C = (n, k)$ generated by the code generator matrix $G$, the constellation points are written
as $\mathbf{x} = 2\mathbf{u} + \mathbf{c}$, with $2\mathbf{u} \in 2\mathbb{Z}^n \cap \mathcal{B}$ and $\mathbf{c} \in C$.

In order to label the constellation points $\mathbf{x}$, we form the bit label vector $\mathbf{b}$ as the concatenation
of two parts $\mathbf{b}_2$ and $\mathbf{b}_3$, i.e., $\mathbf{b} = (\mathbf{b}_2, \mathbf{b}_3)$. The first part $\mathbf{b}_2$ has $k$ bits and indexes the
codeword $\mathbf{c} = \mathbf{b}_2 G$. The second part $\mathbf{b}_3$ labels the integer vectors $\mathbf{u}$, such that $2\mathbf{u} + \mathbf{c} \in \mathcal{B}$.
Note that the number of bits in $\mathbf{b}_3$ depends on the size of $\mathcal{B}$. When $\mathcal{B}$ has a cubic shape, we
can apply a Gray labeling to each component of $\mathbf{u}$.

For example, Fig. 17 shows the labeling of an 8 point constellation carved from $D_2$, where
one bit is used to select one of the two codewords (00) and (11), while the other two bits
select one of the four points in $2\mathbb{Z}^2 \cap \mathcal{B}$.




Finally, we consider the labeling of the entire finite constellation carved from $\Lambda_0 \subset \mathbb{Z}^n$.
In order to follow the partition into cosets induced by $\Lambda \subset \Lambda_0$, we use (59) to get

$$\Lambda_0 = 2\mathbb{Z}^n + [\Lambda_0/\Lambda] + [\Lambda/2\mathbb{Z}^n] = 2\mathbb{Z}^n + [C_0/C] + C \tag{60}$$

In particular, we add $\mathbf{b}_1$ information bits, which are used to label the codewords of the
quotient code $[C_0/C]$. So the final bit label is $\mathbf{b} = (\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3)$.
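The three-part label can be made concrete for the $\mathbb{Z}^2 \supset D_2 \supset 2\mathbb{Z}^2$ example. The sketch below is one possible realization and not necessarily the exact mapping of the figures: it assumes a 16 point constellation in the box $\{0,1,2,3\}^2$, a natural binary (not Gray) mapping of the two bits in $\mathbf{b}_3$, and a mod 2 reduction of the summed coset leaders so that the points stay inside the box. The function name `label_to_point` is hypothetical.

```python
from itertools import product

def label_to_point(b1, b2, b3):
    """Map the label (b1, b2, b3) to a point of Z^2 in the box {0,1,2,3}^2.

    b1: one bit selecting the coset leader b1*(0,1) of D2 in Z^2
    b2: one bit selecting the codeword b2*(1,1) of the repetition code
    b3: two bits (u1, u2) selecting the point 2u of 2Z^2
    """
    u1, u2 = b3
    # (b2*(1,1) + b1*(0,1)) reduced mod 2 (assumed, to stay inside the box)
    c = (b2, (b1 + b2) % 2)
    return (2 * u1 + c[0], 2 * u2 + c[1])

# Full labeling table: 1 + 1 + 2 = 4 bits address 16 points.
points = {
    (b1, b2, b3): label_to_point(b1, b2, b3)
    for b1 in (0, 1)
    for b2 in (0, 1)
    for b3 in product((0, 1), repeat=2)
}
```

With this mapping the bit $b_1$ equals the parity of $x + y$, so it separates $D_2$ from its translate $D_2 + (01)$, and within each of these $b_2$ selects the coset of $2\mathbb{Z}^2$.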

Fig. 18 shows the labeling of the 16-QAM obtained by set partitioning corresponding
to Fig. 15. The extra bit $b_1$ selects one of the two codewords of the dual code (00) and
(01), while $b_2$ and $b_3$ are the same as in Fig. 17. This labeling technique was first proposed
by Ungerboeck and we can observe how the overall labeling is not a Gray labeling of the
16-QAM.

Figure 2: The $E_8$ encoder structure resulting in a E shaped finite constellation.

Figure 3: Performance of $\mathbb{Z}^8$ Golden code with 64-QAM and $E_8$ Golden subcode with 128-QAM
(12 bpcu).

Figure 4: Performance of $\mathbb{Z}^8$ Golden code with 8-QAM and $E_8$ Golden subcode with 16-QAM
(6 bpcu).

Figure 5: General encoder structure of the concatenated scheme.

Figure 6: The 4-state encoder with $g_1(D) = 1$ and $g_2(D) = D$ and corresponding trellis
diagram. Labels on the left are outgoing from each state clockwise, labels on the right are
incoming counterclockwise.

Figure 7: Inter coset distances for a two level partition tree

Figure 8: The optimal 16 states trellis corresponding to the generators $g_1(D) = D$ and
are incoming counterclockwise.

Figure 9: Performance comparison of a 4-state trellis code using 16-QAM constellation and
an uncoded transmission at the rate 5 bpcu, $\Lambda = E_8$, $\Lambda_\ell = 2\mathbb{Z}^8$, $\ell = 2$ (see Example 1).

Figure 10: Performance comparison of 4 and 16 state trellis codes using 16-QAM constellation
and an uncoded transmission at the rate of 7 bpcu and $\Lambda = \mathbb{Z}^8$, $\Lambda_\ell = E_8$, $\ell = 2$ (see Example
2).

Figure 11: The 16 states trellis corresponding to the generators $g_1(D) = D$, $g_2(D) = D^2$,
and $g_3(D) = 1 + D^2$. Labels on the left are outgoing from each state clockwise, labels on the
right are incoming counterclockwise.

Figure 12: The optimal 64 states trellis corresponding to the generators $g_1(D) = D$, $g_2(D) = D^2$,
and $g_3(D) = 1 + D^3$. Labels on the left are outgoing from each state clockwise, labels on
the right are incoming counterclockwise.

Figure 13: Performance comparison of 16 and 64 state trellis codes using 16-QAM constellation
and an uncoded transmission at the rate of 6 bpcu and $\Lambda = \mathbb{Z}^8$, $\Lambda_\ell = L_8$, $\ell = 3$ (see
Example 3).

Figure 14: Performance comparison of 16 and 64 state trellis codes using 64-QAM constellation
and an uncoded transmission at the rate of 10 bpcu and $\Lambda = \mathbb{Z}^8$, $\Lambda_\ell = L_8$, $\ell = 3$ (see
Example 4).

Figure 15: Example of Construction A and set partitioning of $\mathbb{Z}^2$

Figure 16: The two-way partition tree of $\mathbb{Z}^2$

Figure 17: Labeling the finite constellation carved from $D_2$

Figure 18: Labeling the finite constellation carved from $\mathbb{Z}^2$ using the two level set partitioning